GVC: a graph-based Information Retrieval Model

Authors

  • Quoc Dinh Truong
  • Taoufiq Dkaki
  • Josiane Mothe
  • Pierre-Jean Charrel
Abstract

GVC is a new information retrieval model based on Graph Vertices Comparison (GVC). It implements a new similarity measure, based on graph matching, to compare documents and users' queries. In this model, graphs are composed of two types of nodes. Documents, queries and indexing terms are viewed as vertices of a bipartite graph in which each edge goes from a document or a query (the first type of node) to an indexing term (the second type of node). Edges reflect the relationship between documents or queries on the one hand and indexing terms on the other; they are weighted according to the tf.idf principle. Our method implements similarity propagation over graph edges using an iterative process. We evaluate the model on four collections (TREC 2004 Novelty Track, CISI, Cranfield and Medline). We show that, in terms of precision at 5 documents, GVC outperforms the Okapi model by 9% to 62%, depending on the collection.
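The abstract does not give the GVC update rule, so the following is only a minimal sketch of the general idea, not the authors' method. It builds a bipartite graph whose edges carry tf.idf weights and then runs a generic coupled similarity propagation: two nodes are similar when they point to similar terms, and two terms are similar when similar nodes point to them. The function names, the whitespace tokenizer, the smoothed idf formula, the Frobenius normalisation and the fixed iteration count are all assumptions made for illustration.

```python
import math
from collections import Counter


def build_edges(texts):
    """Weight each (node, term) edge of the bipartite graph with tf * idf."""
    tokens = {name: text.lower().split() for name, text in texts.items()}
    df = Counter(t for toks in tokens.values() for t in set(toks))
    n = len(tokens)
    edges = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        # Smoothed idf (an assumption): terms shared by every node keep a small weight.
        edges[name] = {t: c * math.log(1 + n / df[t]) for t, c in tf.items()}
    return edges


def normalise(sim):
    """Scale a nested-dict similarity matrix to unit Frobenius norm."""
    norm = math.sqrt(sum(v * v for row in sim.values() for v in row.values())) or 1.0
    return {a: {b: v / norm for b, v in row.items()} for a, row in sim.items()}


def propagate(edges, iterations=2):
    """Alternate between node and term similarities along the graph edges."""
    nodes = list(edges)
    terms = sorted({t for row in edges.values() for t in row})
    # Start from the identity: each term is similar only to itself.
    term_sim = {s: {t: 1.0 if s == t else 0.0 for t in terms} for s in terms}
    node_sim = {}
    for _ in range(iterations):
        # Two nodes are similar when they point to similar terms.
        node_sim = normalise({
            a: {b: sum(wa * wb * term_sim[s][t]
                       for s, wa in edges[a].items()
                       for t, wb in edges[b].items())
                for b in nodes}
            for a in nodes})
        # Two terms are similar when similar nodes point to them.
        term_sim = normalise({
            s: {t: sum(edges[a].get(s, 0.0) * edges[b].get(t, 0.0) * node_sim[a][b]
                       for a in nodes for b in nodes)
                for t in terms}
            for s in terms})
    return node_sim


def rank(texts, query_name):
    """Order documents by their propagated similarity to the query node."""
    sim = propagate(build_edges(texts))
    docs = [name for name in texts if name != query_name]
    return sorted(docs, key=lambda d: sim[query_name][d], reverse=True)


texts = {
    "q": "graph retrieval",
    "d1": "graph matching",
    "d2": "matching vertices",
    "d3": "pasta recipes",
}
ranking = rank(texts, "q")
```

In this toy collection `d2` shares no term with the query, yet after the second pass it ranks above the unrelated `d3` because it shares "matching" with `d1`, which in turn shares "graph" with the query. The iteration count is deliberately small: repeated passes of this particular scheme push the scores toward a single dominant pattern that no longer depends on the query.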


Similar resources

Factors Affecting Student's Scientific Information Retrieval based on Fuzzy Logic Method Compared to Traditional Method

Background and aim: The aim of this study was to identify the factors affecting students' performance in information retrieval based on the fuzzy logic method compared to the traditional method. Materials and methods: This descriptive survey study was performed using a quantitative approach. The research population was 34 PhD students, and a researcher-made questionnaire was used. Data were analyzed...

An Effective Path-aware Approach for Keyword Search over Data Graphs

Keyword search is known as a user-friendly alternative to structured query languages for retrieving information from graph-structured data. Efficient retrieval of relevant answers to a keyword query and effective ranking of these answers according to their relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...

The generalized vertex cover problem and some variations

In this paper we study the generalized vertex cover problem (GVC), which is a generalization of various well studied combinatorial optimization problems. GVC is shown to be equivalent to the unconstrained binary quadratic programming problem and also equivalent to some other variations of the general GVC. Some solvable cases are identified and approximation algorithms are suggested for special ...

Real-Time intrusion detection alert correlation and attack scenario extraction based on the prerequisite consequence approach

Alert correlation systems attempt to discover the relations among alerts produced by one or more intrusion detection systems to determine the attack scenarios and their main motivations. In this paper a new IDS alert correlation method is proposed that can be used to detect attack scenarios in real-time. The proposed method is based on a causal approach due to the strength of causal methods in ...

Publication date: 2008